Goto

Collaborating Authors

 molecular representation





Learning Invariant Molecular Representation in Latent Discrete Space Xiang Zhuang

Neural Information Processing Systems

Molecular representation learning lays the foundation for drug discovery. However, existing methods suffer from poor out-of-distribution (OOD) generalization, particularly when data for training and testing originate from different environments.





884d247c6f65a96a7da4d1105d584ddd-Supplemental.pdf

Neural Information Processing Systems

To push the boundaries of molecular representation learning, we present PhysChem, a novel neural architecture that learns molecular representations via fusing physical and chemical information of molecules.



May the Force be with You: Unified Force-Centric Pre-Training for 3D Molecular Conformations

Neural Information Processing Systems

Recent works have shown the promise of learning pre-trained models for 3D molecular representation.However, existing pre-training models focus predominantly on equilibrium data and largely overlook off-equilibrium conformations.It is challenging to extend these methods to off-equilibrium data because their training objective relies on assumptions ofconformations being the local energy minima. We address this gap by proposing a force-centric pretraining model for 3D molecular conformations covering both equilibrium and off-equilibrium data.For off-equilibrium data, our model learns directly from their atomic forces. For equilibrium data, we introduce zero-force regularization and forced-based denoising techniques to approximate near-equilibrium forces.We obtain a unified pre-trained model for 3D molecular representation with over 15 million diverse conformations. Experiments show that, with our pre-training objective, we increase forces accuracy by around 3 times compared to the un-pre-trained Equivariant Transformer model. By incorporating regularizations on equilibrium data, we solved the problem of unstable MD simulations in vanilla Equivariant Transformers, achieving state-of-the-art simulation performance with 2.45 times faster inference time than NequIP. As a powerful molecular encoder, our pre-trained model achieves on-par performance with state-of-the-art property prediction tasks.